Goto

Collaborating Authors

 log-concave distribution


Robust learning of halfspaces under log-concave marginals

Neural Information Processing Systems

We say that a classifier is adversarially robust to perturbations of norm r if, with high probability over a point xdrawn from the input distribution, there is no point within distance rfrom xthat is classified differently. The boundary volume is the probability that a point falls within distance r of a point with a different label. This work studies the task of computationally efficient learning of hypotheses with small boundary volume, where the input is distributed as a subgaussian isotropic log-concave distribution over Rd. Linear threshold functions are adversarially robust; they have boundary volume proportional to r. Such concept classes are efficiently learnable by polynomial regression, which produces a polynomial threshold function (PTF), but PTFs in general may have boundary volume โ„ฆ(1), even for r 1. We give an algorithm that agnostically learns linear threshold functions and returns a classifier with boundary volume O(r+ฮต)at radius of perturbation r.


Sample and Map from a Single Convex Potential: Generation using Conjugate Moment Measures

Neural Information Processing Systems

The canonical approach in generative modeling is to split model fitting into two blocks: define first how to sample noise (e.g. Gaussian) and choose next what to do with it (e.g. using a single map or flows). We explore in this work an alternative route that ties sampling and mapping. We find inspiration in moment measures, a result that states that for any measure $\rho$, there exists a unique convex potential $u$ such that $\rho=\nabla u \sharp e^{-u}$. While this does seem to tie effectively sampling (from log-concave distribution $e^{-u}$) and action (pushing particles through $\nabla u$), we observe on simple examples (e.g., Gaussians or 1D distributions) that this choice is ill-suited for practical tasks. We study an alternative factorization, where $\rho$ is factorized as $\nabla w^* \sharp e^{-w}$, where $w^\ast$ is the convex conjugate of a convex potential $w$. We call this approach conjugate moment measures, and show far more intuitive results on these examples. Because $\nabla w^*$ is the Monge map between the log-concave distribution $e^{-w}$ and $\rho$, we rely on optimal transport solvers to propose an algorithm to recover $w$ from samples of $\rho$, and parameterize $w$ as an input-convex neural network. We also address the common sampling scenario in which the density of $\rho$ is known only up to a normalizing constant, and propose an algorithm to learn $w$ in this setting.



An Exploration-by-Optimization Approach to Best of Both Worlds in Linear Bandits

Neural Information Processing Systems

In this paper, we consider how to construct best-of-both-worlds linear bandit algorithms that achieve nearly optimal performance for both stochastic and adversarial environments. For this purpose, we show that a natural approach referred to as exploration by optimization [Lattimore and Szepesvรกri, 2020b] works well.


Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering Langevin Monte Carlo

Neural Information Processing Systems

A key task in Bayesian machine learning is sampling from distributions that are only specified up to a partition function (i.e., constant of proportionality). One prevalent example of this is sampling posteriors in parametric distributions, such as latent-variable generative models. However sampling (even very approximately) can be #P-hard. Classical results (going back to Bakry and Emery) on sampling focus on log-concave distributions, and show a natural Markov chain called Langevin diffusion mix in polynomial time. However, all log-concave distributions are uni-modal, while in practice it is very common for the distribution of interest to have multiple modes.